Abstract

It is known that the number of overlap-free binary words of length $n$ grows polynomially, while the number of cubefree binary words grows exponentially. We show that the dividing line between polynomial and exponential growth is $7/3$. More precisely, there are only polynomially many binary words of length $n$ that avoid $7/3$-powers, but there are exponentially many binary words of length $n$ that avoid $(7/3)^+$-powers. This answers an open question of Kobayashi from 1986.

1 Introduction 

We are concerned in this paper with problems on combinatorics of words [?].

Let $\Sigma$ be a finite nonempty set, called an alphabet. We consider finite and infinite words over $\Sigma$. The set of all finite words over $\Sigma$ is denoted by $\Sigma^*$. The set of all infinite words (that is, maps from $\mathbb{N}$ to $\Sigma$) is denoted by $\Sigma^\omega$. In this paper we often use a particular class of alphabets, namely

$$\Sigma_k := \{0, 1, \ldots, k-1\}.$$

A morphism is a map $h : \Sigma^* \to \Delta^*$ such that $h(xy) = h(x)h(y)$ for all $x, y \in \Sigma^*$. A morphism may be specified by providing the image words $h(a)$ for all $a \in \Sigma$. If $h : \Sigma^* \to \Sigma^*$ and $h(a) = ax$ for some letter $a \in \Sigma$, then we say that $h$ is prolongable on $a$, and we can then iterate $h$ infinitely often to get the fixed point $h^\omega(a) := a\,x\,h(x)\,h^2(x)\,h^3(x) \cdots$. If there exists an integer $k$ such that the morphism $h$ satisfies $|h(a)| = k$ for all $a \in \Sigma$, we say it is $k$-uniform. If a morphism is $k$-uniform for some $k$, then we say it is uniform. For more on morphisms, see, for example, [3].

A square is a nonempty word of the form $xx$, as in the English word murmur. A cube is a nonempty word of the form $xxx$, as in the Finnish word kokoko. An overlap is a word of the form $axaxa$, where $x$ is a possibly empty word and $a$ is a single letter, as in the English word alfalfa. A word $v$ is a factor (sometimes called a subword) of a word $x$ if $x$ can be written $x = uvw$ for some words $u, w$. A word avoids squares (resp., cubes, overlaps) if it contains no factor that is a square (resp., cube, overlap). Such words are also called squarefree (resp., cubefree, overlap-free). For example, the English word square is squarefree, whereas squarefree is not.

It is well known and easily proved that every word of length 4 or more over a two-letter alphabet contains a square as a factor. However, Thue proved in 1906 [?] that there exist infinite squarefree words over a three-letter alphabet. Thue also proved that the word $\mu^\omega(0) = 0110100110010110 \cdots$ is overlap-free (and hence cubefree); here $\mu$ is the Thue-Morse morphism sending $0 \to 01$ and $1 \to 10$.
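As a quick sanity check on these classical facts, the following Python sketch iterates the Thue-Morse morphism and tests words for squares and overlaps by scanning periods. The function names (`mu`, `thue_morse_prefix`, `longest_run`, `has_square`, `has_overlap`) are illustrative choices, not notation from the paper.

```python
def mu(word):
    """Thue-Morse morphism: 0 -> 01, 1 -> 10."""
    return "".join("01" if c == "0" else "10" for c in word)


def thue_morse_prefix(k):
    """Return mu^k(0), the prefix of length 2**k of the fixed point mu^omega(0)."""
    word = "0"
    for _ in range(k):
        word = mu(word)
    return word


def longest_run(word, p):
    """Length of the longest factor of `word` with period p and length > p (0 if none)."""
    best = run = 0
    for i in range(len(word) - p):
        run = run + 1 if word[i] == word[i + p] else 0
        best = max(best, run)
    return best + p if best else 0


def has_square(word):
    """True if some factor is xx with x nonempty (period p, length >= 2p)."""
    return any(longest_run(word, p) >= 2 * p for p in range(1, len(word) // 2 + 1))


def has_overlap(word):
    """True if some factor is axaxa (period p, length >= 2p + 1)."""
    return any(longest_run(word, p) >= 2 * p + 1 for p in range(1, len(word) // 2 + 1))


print(thue_morse_prefix(4))  # 0110100110010110
```

The period scan is quadratic in the word length, which is adequate for the short words considered here.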

Dejean [?] initiated the study of fractional powers. Let $\alpha$ be a rational number $\ge 1$. Following Brandenburg [2], we say that a nonempty word $w$ is an $\alpha$-power if there exist words $y, y' \in \Sigma^*$ such that $w = y^n y'$, and $y'$ is a prefix of $y$ with $n + |y'|/|y| = \alpha$. For example, the English word alfalfa is a $7/3$-power and the word ionization is a $10/7$-power. If $\alpha$ is a real number, we say that a word $w$ avoids $\alpha$-powers (or is $\alpha$-power-free) if it contains no factor that is a $\beta$-power for any rational $\beta \ge \alpha$. We say that a word $w$ avoids $\alpha^+$-powers (or is $\alpha^+$-power-free) if it contains no factor that is a $\beta$-power for rational $\beta > \alpha$. Thus a word is overlap-free iff it is $2^+$-power-free.
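The definitions above translate directly into a small checker. The sketch below is one possible rendering, using exact rational arithmetic; the names `exponent` and `avoids` and the `plus` flag are assumptions made for illustration. It relies on the observation that a factor of length $L$ with period $p \le L$ is an $(L/p)$-power.

```python
from fractions import Fraction


def exponent(word):
    """Exponent of a nonempty word as a power of its shortest period: len / period."""
    n = len(word)
    for p in range(1, n + 1):
        if all(word[i] == word[i + p] for i in range(n - p)):
            return Fraction(n, p)


def avoids(word, alpha, plus=False):
    """True if `word` avoids alpha-powers (alpha^+-powers when plus=True)."""
    n = len(word)
    for p in range(1, n):
        run = 0  # number of consecutive positions i with word[i] == word[i + p]
        for i in range(n - p):
            run = run + 1 if word[i] == word[i + p] else 0
            if run == 0:
                continue
            beta = Fraction(run + p, p)  # the factor ending here is a beta-power
            if beta > alpha or (beta == alpha and not plus):
                return False
    return True


print(exponent("alfalfa"), exponent("ionization"))  # 7/3 10/7
```

With `plus=False` the test is "no factor of exponent at least alpha"; with `plus=True` it is "no factor of exponent strictly greater than alpha", matching the two notions of avoidance.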

We may enumerate the number of words avoiding various patterns. Brandenburg [2] proved (among other things) that there are exponentially many cubefree binary words; also see Edlin [?]. Restivo and Salemi [?] proved that there exist only polynomially many overlap-free binary words of length $n$; in fact they gave an upper bound of $O(n^{\log_2 15})$. The exponent $\log_2 15$ was improved to 1.7 by Kfoury [8], to 1.587 by Kobayashi [?], and to 1.37 by Lepistö [?]. Also see Cassaigne [?].

Overlap-free words avoid $2^+$-powers, and there are only polynomially many over $\Sigma_2$. Cubefree words avoid 3-powers and there are exponentially many over $\Sigma_2$. Kobayashi [?, Problem 6.6] asked the following natural question: at what exponent $\alpha$ (if any) does the number of binary words avoiding $\alpha$-powers jump from polynomial to exponential? In this paper we prove that the answer is $7/3$. Our proof uses the fact that various structure theorems, which previously were known for overlap-free words, also hold for any exponent $\alpha$ with $2 < \alpha \le 7/3$.



2 Preliminary lemmas 

We begin with some notation and preliminary results. We write $\overline{0} = 1$ and $\overline{1} = 0$. We let $\mu$ be the Thue-Morse morphism mentioned in the previous section.

Lemma 1 Let $t, v \in \Sigma_2^*$. If there exist $c, d \in \Sigma_2$ such that $c\,\mu(t) = \mu(v)\,d$, then $d = c$, $t = \overline{c}^{\,n}$, and $v = c^n$, where $n = |t| = |v|$.

Proof. See [?, Lemma 1.7.2]. ■

Lemma 2 Suppose $t, y \in \Sigma_2^*$ and $\mu(t) = yy$. Then there exists $v \in \Sigma_2^*$ such that $y = \mu(v)$.

Proof. See [?, Lemma 1.7.3]. ■

Lemma 3 Let $h : \Sigma^* \to \Delta^*$ be a uniform morphism, and let $\alpha$ be a rational number. If $w$ contains an $\alpha$-power, then $h(w)$ contains an $\alpha$-power.

Proof. Suppose $w$ contains an $\alpha$-power. Then there exist words $s, s' \in \Sigma^+$ and $r, t \in \Sigma^*$ such that $w = r s^n s' t$, where $s'$ is a nonempty prefix of $s$ and $n + |s'|/|s| = \alpha$. Then $h(w) = h(r)h(s)^n h(s')h(t)$. Since $h$ is uniform, $h(s')$ is a prefix of $h(s)$ with $|h(s')|/|h(s)| = |s'|/|s|$, so $h(w)$ contains the $\alpha$-power $h(s)^n h(s')$. ■

Note that for arbitrary morphisms the result need not be true (unless $\alpha$ is an integer).

Lemma 4 Let $w \in \Sigma_2^*$, and suppose $\mu(w)$ contains an $\alpha$-power. Then $w$ contains a $\beta$-power with $\beta \ge \alpha$.

Proof. Suppose $\mu(w)$ contains an $\alpha$-power, say $\mu(w) = x y^n y' z$, where $n + |y'|/|y| = \alpha$.

There are four cases to consider, based on the parity of $|x|$ and $|y|$.

Case 1: $|x|$ is even and $|y|$ is even. There are two subcases, depending on the parity of $|y'|$.

Case 1a: $|y'|$ is even. Then $|z|$ is even. Then there exist words $r, s, s', t$, with $s'$ a prefix of $s$, such that $\mu(r) = x$, $\mu(s) = y$, $\mu(s') = y'$, and $\mu(t) = z$. Then $w = r s^n s' t$, and so $w$ contains the $\alpha$-power $s^n s'$.

Case 1b: $|y'|$ is odd. Then $|z|$ is odd. Then there exist words $r, s, s', t$, with $s'$ a prefix of $s$, and a letter $c$ such that $\mu(r) = x$, $\mu(s) = y$, $\mu(s')c = y'$, and $\overline{c}\mu(t) = z$. Since $|y'|$ is odd, $|y|$ is even, and $y'$ is a prefix of $y$, it follows that $y'\overline{c}$ is also a prefix of $y$. Hence $s'c$ is a prefix of $s$. Then $w$ contains the $\beta$-power $s^n s' c$, where

$$\beta = n + \frac{|s'c|}{|s|} = n + \frac{2|s'| + 2}{2|s|} = n + \frac{|y'| + 1}{|y|} > n + \frac{|y'|}{|y|} = \alpha.$$



Case 2: $|x|$ is even and $|y|$ is odd. Then there exists a word $t$ such that $\mu(t) = yy$. From Lemma 2 there exists $v$ such that $y = \mu(v)$. But then $|y|$ is even, a contradiction. Thus this case cannot occur.

Case 3: $|x|$ is odd and $|y|$ is even. There are two subcases, depending on the parity of $|y'|$.

Case 3a: $|y'|$ is even. Then $|z|$ is odd. Then there exist words $r, s, s', t$ and letters $c, d, e$ such that $x = \mu(r)c$, $y = \overline{c}\mu(s)d$, $y' = \overline{d}\mu(s')e$, and $z = \overline{e}\mu(t)$. Consideration of the factor $yy$ gives $c = d$. Hence $\mu(w) = \mu(r(cs)^n c s' e t)$ and so $w = r(cs)^n c s' e t$. Thus $w$ contains the $\alpha$-power $(cs)^n cs'$, and since $s'$ is a prefix of $s$, it follows that $cs'$ is a prefix of $cs$.

Case 3b: $|y'|$ is odd. Then $|z|$ is even. Then we are in the mirror image of the subcase of Case 1 in which $|y'|$ is odd, and the same proof works.

Case 4: $|x|$ is odd and $|y|$ is odd. Then from length considerations we see that there exist words $t, v$ and letters $c, d$ such that $y = c\mu(t) = \mu(v)d$. By Lemma 1 we have $d = c$, $t = \overline{c}^{\,m}$, $v = c^m$, where $m = |t| = |v|$. Thus $y = c(\overline{c}c)^m$. Since $y'$ is a nonempty prefix of $y$, we may write $\mu(w) = x y^2 c\, t'$ for some word $t'$. Since $|x|$ and $|y|$ are odd, and $y$ ends in $c$, we must have that $cc$ is the image of a letter under $\mu$, a contradiction. Thus this case cannot occur. ■

Theorem 5 Let $w \in \Sigma_2^*$, and let $\alpha > 2$ be a real number. Then $w$ is $\alpha$-power-free iff $\mu(w)$ is $\alpha$-power-free.

Proof. Combine Lemmas 3 and 4. ■

We remark that Theorem 5 is not true if $\alpha = 2$; for example $w = 01$ contains no square, but $\mu(w) = 0110$ does.
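Theorem 5 and the remark about $\alpha = 2$ can be spot-checked by brute force on short words. The following sketch is only an empirical check under that length cutoff, not a substitute for the proof; `counterexamples` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product


def mu(word):
    """Thue-Morse morphism: 0 -> 01, 1 -> 10."""
    return "".join("01" if c == "0" else "10" for c in word)


def avoids(word, alpha):
    """True if no factor of `word` is a beta-power with beta >= alpha."""
    n = len(word)
    for p in range(1, n):
        run = 0
        for i in range(n - p):
            run = run + 1 if word[i] == word[i + p] else 0
            if run and Fraction(run + p, p) >= alpha:
                return False
    return True


def counterexamples(alpha, max_len):
    """Binary words w with |w| <= max_len for which w and mu(w) disagree."""
    bad = []
    for n in range(max_len + 1):
        for letters in product("01", repeat=n):
            w = "".join(letters)
            if avoids(w, alpha) != avoids(mu(w), alpha):
                bad.append(w)
    return bad


print(counterexamples(Fraction(7, 3), 8))   # expected to be empty by Theorem 5
print("01" in counterexamples(Fraction(2), 3))
```

For $\alpha = 2$ the word $01$ shows up as a counterexample, in line with the remark above.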

3 A structure theorem for $\alpha$-power-free words for $2 < \alpha \le 7/3$

Restivo and Salemi proved a beautiful structure theorem for overlap-free binary words. Roughly speaking, it says that any overlap-free word is, up to removal of a short prefix or suffix, the image of another overlap-free word under $\mu$, the Thue-Morse morphism. Perhaps surprisingly, the same sort of structure theorem exists for binary words avoiding $\alpha$-powers, where $\alpha$ is any real number with $2 < \alpha \le 7/3$.

Theorem 6 Let $x$ be a word avoiding $\alpha$-powers, with $2 < \alpha \le 7/3$. Let $\mu$ be the Thue-Morse morphism. Then there exist $u, v, y$ with $u, v \in \{\epsilon, 0, 1, 00, 11\}$ and a word $y \in \Sigma_2^*$ avoiding $\alpha$-powers, such that $x = u\mu(y)v$.

Proof. We prove the result by induction on $|x|$. If $|x| \le 2$, then the factorizations can be chosen as shown in the following table.

    x     u     y     v
    ε     ε     ε     ε
    0     0     ε     ε
    1     1     ε     ε
    00    00    ε     ε
    01    ε     0     ε
    10    ε     1     ε
    11    11    ε     ε

Now suppose the claim is true for all $x$ with $|x| < k$. We prove it for $|x| = k$. Let $x$ be $\alpha$-power-free with $|x| \ge 3$. Write $x = az$ with $a \in \Sigma_2$ and $z \in \Sigma_2^*$. Since $x$ is $\alpha$-power-free, so is $z$. Since $|z| < |x|$, by induction there exist $u', v' \in \{\epsilon, 0, 1, 00, 11\}$ and an $\alpha$-power-free word $y'$ such that $z = u'\mu(y')v'$.

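Now there are several cases to consider.

Case 1: $u' = \epsilon$ or $u' = a$. Then we may write $x = u\mu(y)v$ with $(u, y, v) = (au', y', v')$.

Case 2: $u' = \overline{a}$. Then $x = u\mu(y)v$ with $(u, y, v) = (\epsilon, ay', v')$. Since $x$ is $\alpha$-power-free, so is $\mu(ay')$, and hence, by Theorem 5, so is $ay'$.

Case 3: $u' = aa$. Then $x$ begins with $aaa = a^3$, and so $x$ does not avoid $\alpha$-powers.

Case 4: $u' = \overline{a}\,\overline{a}$. Then $x = a\overline{a}\,\overline{a}\mu(y')v'$.

Case 4.a: $|y'| = 0$. Then $x = a\overline{a}\,\overline{a}v'$. If $v' = \epsilon$ (resp., $v' = a$, $v' = aa$), then we can write $x = u\mu(y)v$ with $(u, y, v) = (\epsilon, a, \overline{a})$ (resp., $(u, y, v) = (\epsilon, a\overline{a}, \epsilon)$, $(u, y, v) = (\epsilon, a\overline{a}, a)$). Otherwise, if $v' = \overline{a}$ or $v' = \overline{a}\,\overline{a}$, then $x$ contains $\overline{a}\,\overline{a}\,\overline{a} = \overline{a}^3$, and so $x$ does not avoid $\alpha$-powers.

Case 4.b: $|y'| \ge 1$. There are two cases to consider.

Case 4.b.i: $y' = ay''$. There are several cases to consider.

Case 4.b.i.1: $|y''| = 0$. Then $y' = a$ and $x = a\overline{a}\,\overline{a}a\overline{a}v'$. If $v' = \epsilon$ (resp., $v' = a$, $v' = aa$, $v' = \overline{a}$), then we can write $x = u\mu(y)v$ with $(u, y, v) = (\epsilon, a\overline{a}, \overline{a})$ (resp., $(u, y, v) = (\epsilon, a\overline{a}\,\overline{a}, \epsilon)$, $(u, y, v) = (\epsilon, a\overline{a}\,\overline{a}, a)$, $(u, y, v) = (\epsilon, a\overline{a}, \overline{a}\,\overline{a})$). Otherwise, if $v' = \overline{a}\,\overline{a}$, then $x$ contains $\overline{a}\,\overline{a}\,\overline{a} = \overline{a}^3$, and so $x$ does not avoid $\alpha$-powers.

Case 4.b.i.2: $|y''| \ge 1$. If $y'' = ay'''$, then $x = a\overline{a}\,\overline{a}a\overline{a}a\overline{a}\mu(y''')v'$, so $x$ contains the $5/2$-power $\overline{a}a\overline{a}a\overline{a}$. If $y'' = \overline{a}y'''$, then $x = a\overline{a}\,\overline{a}a\overline{a}\,\overline{a}a\mu(y''')v'$, so $x$ contains the $7/3$-power $a\overline{a}\,\overline{a}a\overline{a}\,\overline{a}a$.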

Case 4.b.ii: $y' = \overline{a}y''$. Then $x = a\overline{a}\,\overline{a}\,\overline{a}a\mu(y'')v'$. Thus $x$ contains $\overline{a}\,\overline{a}\,\overline{a} = \overline{a}^3$.

Our proof by induction is now complete. ■

The decomposition in Theorem 6 is actually unique if $|x| \ge 7$. As this requires more tedious case analysis and is not crucial to our discussion, we do not prove this here.

We also note that the role of $7/3$ in Theorem 6 is crucial, since no word $\cdots 0110110 \cdots$ can be factorized in the stated form.
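The factorization of Theorem 6 can be searched for directly, since there are only 25 choices of $(u, v)$. The sketch below enumerates them; `mu_inverse` and `factorizations` are hypothetical helper names, and the search is a brute-force illustration rather than the inductive construction of the proof.

```python
from fractions import Fraction
from itertools import product

ENDS = ["", "0", "1", "00", "11"]  # allowed values of u and v


def mu_inverse(core):
    """Return y with mu(y) == core, or None if core is not a Thue-Morse image."""
    if len(core) % 2:
        return None
    blocks = [core[i:i + 2] for i in range(0, len(core), 2)]
    if any(b not in ("01", "10") for b in blocks):
        return None
    return "".join(b[0] for b in blocks)


def factorizations(x):
    """All triples (u, y, v) with x == u + mu(y) + v and u, v in ENDS."""
    found = []
    for u, v in product(ENDS, repeat=2):
        if len(u) + len(v) > len(x) or not x.startswith(u) or not x.endswith(v):
            continue
        y = mu_inverse(x[len(u):len(x) - len(v)])
        if y is not None:
            found.append((u, y, v))
    return found


def avoids(word, alpha):
    """True if no factor of `word` is a beta-power with beta >= alpha."""
    n = len(word)
    for p in range(1, n):
        run = 0
        for i in range(n - p):
            run = run + 1 if word[i] == word[i + p] else 0
            if run and Fraction(run + p, p) >= alpha:
                return False
    return True


print(factorizations("0110110"))  # [] : the 7/3-power 0110110 has no factorization
```

Since $\mu(y)$ is a factor of $x$, Theorem 5 guarantees that any $y$ found this way is $\alpha$-power-free whenever $x$ is.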

4 Polynomial upper bound on the number of $7/3$-power-free words

Theorem 6 has the following implication. Let $x = x_0$ be a nonempty binary word that is $\alpha$-power-free, with $2 < \alpha \le 7/3$. Then by Theorem 6 we can write $x_0 = u_1\mu(x_1)v_1$ with $|u_1|, |v_1| \le 2$. If $|x_1| \ge 1$, we can repeat the process, writing $x_1 = u_2\mu(x_2)v_2$. Continuing in this fashion, we obtain the decomposition $x_{i-1} = u_i\mu(x_i)v_i$ until $|x_{t+1}| = 0$ for some $t$. Then

$$x_0 = u_1\,\mu(u_2)\cdots\mu^{t-1}(u_t)\,\mu^t(x_t)\,\mu^{t-1}(v_t)\cdots\mu(v_2)\,v_1.$$

Then from the inequalities $1 \le |x_t| \le 4$ and $2|x_i| \le |x_{i-1}| \le 2|x_i| + 4$, $1 \le i \le t$, an easy induction gives $2^t \le |x| \le 2^{t+3} - 4$. Thus $t \le \log_2 |x| < t + 3$, and so

$$\log_2 |x| - 3 < t \le \log_2 |x|. \tag{1}$$

There are at most 5 possibilities for each $u_i$ and $v_i$, and there are at most 22 possibilities for $x_t$ (since $1 \le |x_t| \le 4$ and $x_t$ is $\alpha$-power-free). Inequality (1) shows there are at most 3 possibilities for $t$. Letting $n = |x|$, we see there are at most $3 \cdot 22 \cdot 5^{2\log_2 n} = 66\,n^{\log_2 25}$ words of length $n$ that avoid $\alpha$-powers. We have therefore proved

Theorem 7 Let $2 < \alpha \le 7/3$. There are $O(n^{\log_2 25}) = O(n^{4.644})$ binary words of length $n$ that avoid $\alpha$-powers.

We have not tried to optimize the exponent in Theorem 7. Probably it can be made significantly smaller.
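The constants in the counting argument of this section are easy to recompute. The sketch below, with hypothetical helper names `base_words` and `crude_bound`, counts the admissible words $x_t$ and checks the identity $5^{2\log_2 n} = n^{\log_2 25}$ numerically.

```python
import math
from fractions import Fraction
from itertools import product


def avoids(word, alpha):
    """True if no factor of `word` is a beta-power with beta >= alpha."""
    n = len(word)
    for p in range(1, n):
        run = 0
        for i in range(n - p):
            run = run + 1 if word[i] == word[i + p] else 0
            if run and Fraction(run + p, p) >= alpha:
                return False
    return True


def base_words(alpha):
    """alpha-power-free binary words x_t with 1 <= |x_t| <= 4."""
    return ["".join(t)
            for n in range(1, 5)
            for t in product("01", repeat=n)
            if avoids("".join(t), alpha)]


def crude_bound(n):
    """The bound 3 * 22 * 5^(2 log2 n) from the text, as a float."""
    return 3 * 22 * 5 ** (2 * math.log2(n))


print(len(base_words(Fraction(7, 3))))   # 22
print(round(math.log2(25), 3))           # 4.644
```

The bound is far from tight for small $n$; it is only meant to exhibit polynomial growth.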

5 Exponential lower bound on the number of $(7/3)^+$-power-free words

In this section we prove that there are exponentially many binary words of length $n$ avoiding $(7/3)^+$-powers.

Define the 21-uniform morphism $h : \Sigma_4^* \to \Sigma_2^*$ as follows:

    h(0) = 011010011001001101001
    h(1) = 100101100100110010110
    h(2) = 100101100110110010110
    h(3) = 011010011011001101001.

We first show

Lemma 8 Let $w$ be any squarefree word over $\Sigma_4$. Then

(i) $h(w)$ contains no square $yy$ with $|y| > 13$; and

(ii) $h(w)$ contains no $(7/3)^+$-powers.



Proof. We first prove (i). We argue by contradiction. Let $w = a_1 a_2 \cdots a_n$ be a squarefree word such that $h(w)$ contains a square, i.e., $h(w) = xyyz$ for some $x, z \in \Sigma_2^*$, $y \in \Sigma_2^+$. Without loss of generality, assume that $w$ is a shortest such word, so that $0 \le |x|, |z| < 21$.

Case 1: $|y| \le 42$. In this case we can take $|w| \le 5$. To verify that $h(w)$ contains no squares $yy$ with $|y| > 13$, it therefore suffices to check the image of each of the 264 squarefree words in $\Sigma_4^5$.

Case 2: $|y| > 42$. First, we observe the following facts about $h$.

Fact 9 (i) Suppose $h(ab) = t\,h(c)\,u$ for some letters $a, b, c \in \Sigma_4$ and words $t, u \in \Sigma_2^*$. Then this inclusion is trivial (that is, $t = \epsilon$ or $u = \epsilon$) or $u$ is not a prefix of $h(d)$ for any $d \in \Sigma_4$.

(ii) Suppose there exist letters $a, b, c \in \Sigma_4$ and words $s, t, u, v \in \Sigma_2^*$ such that $h(a) = st$, $h(b) = uv$, and $h(c) = sv$. Then either $a = c$ or $b = c$.

Proof.

(i) This can be verified with a short computation.

(ii) This can also be verified with a short computation. If $|s| \ge 11$, then no two images of distinct letters share a prefix of length 11. If $|s| \le 10$, then $|t| \ge 11$, and no two images of distinct letters share a suffix of length 11.

■
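Part (ii) of Fact 9 is small enough to check exhaustively. The following sketch tests the prefix and suffix observation from the proof and then the statement itself over all letters and all split points; the function names are illustrative assumptions.

```python
from itertools import product

# Images of the 21-uniform morphism h defined in this section.
H = {
    "0": "011010011001001101001",
    "1": "100101100100110010110",
    "2": "100101100110110010110",
    "3": "011010011011001101001",
}


def distinct_prefixes(k):
    """True if the four images have pairwise distinct prefixes of length k."""
    return len({image[:k] for image in H.values()}) == len(H)


def distinct_suffixes(k):
    """True if the four images have pairwise distinct suffixes of length k."""
    return len({image[-k:] for image in H.values()}) == len(H)


def fact9_ii_holds():
    """Check: h(c) = (prefix of h(a)) + (suffix of h(b)) forces c == a or c == b."""
    for a, b, c in product(H, repeat=3):
        for k in range(22):  # k = |s| ranges over 0..21
            if H[c] == H[a][:k] + H[b][k:] and c not in (a, b):
                return False
    return True


print(distinct_prefixes(11), distinct_suffixes(11), fact9_ii_holds())
```

Part (i) can be checked by a similar loop over the interior occurrences of $h(c)$ in $h(ab)$; that loop is omitted here.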

Now we resume the proof of Lemma 8. For $i = 1, 2, \ldots, n$ define $A_i = h(a_i)$. Then if $h(w) = xyyz$, we can write

$$h(w) = A_1 A_2 \cdots A_n = A_1' A_1'' A_2 \cdots A_{j-1} A_j' A_j'' A_{j+1} \cdots A_{n-1} A_n' A_n''$$

where

$$A_1 = A_1' A_1'', \qquad A_j = A_j' A_j'', \qquad A_n = A_n' A_n'',$$

$$x = A_1', \qquad y = A_1'' A_2 \cdots A_{j-1} A_j' = A_j'' A_{j+1} \cdots A_{n-1} A_n', \qquad z = A_n'',$$

and $|A_1''|, |A_j''| > 0$. See Figure 1.

Figure 1: The word $xyyz$ within $h(w)$

If $|A_1''| > |A_j''|$, then $A_{j+1} = h(a_{j+1})$ is a factor of $A_1'' A_2$, hence a factor of $A_1 A_2 = h(a_1 a_2)$. Thus we can write $A_{j+2} = A_{j+2}' A_{j+2}''$ with

$$A_1'' A_2 = A_j'' A_{j+1} A_{j+2}'.$$

See Figure 2.

Figure 2: The case $|A_1''| > |A_j''|$

But then, by Fact 9 (i), either $|A_j''| = 0$, or $|A_{j+2}'| = 0$ (so $|A_1''| = |A_j''|$), or $A_{j+2}'$ is not a prefix of any $h(d)$. All three conclusions are impossible.

If $|A_1''| < |A_j''|$, then $A_2 = h(a_2)$ is a factor of $A_j'' A_{j+1}$, hence a factor of $A_j A_{j+1} = h(a_j a_{j+1})$. Thus we can write $A_3 = A_3' A_3''$ with

$$A_1'' A_2 A_3' = A_j'' A_{j+1}.$$

See Figure 3.

Figure 3: The case $|A_1''| < |A_j''|$

By Fact 9 (i), either $|A_1''| = 0$, or $|A_3'| = 0$ (so $|A_1''| = |A_j''|$), or $A_3'$ is not a prefix of any $h(d)$. Again, all three conclusions are impossible.

Therefore $|A_1''| = |A_j''|$. Hence $A_1'' = A_j''$, $A_2 = A_{j+1}$, $\ldots$, $A_{j-1} = A_{n-1}$, and $A_j' = A_n'$. Since $h$ is injective, we have $a_2 = a_{j+1}, \ldots, a_{j-1} = a_{n-1}$. It also follows that $|y|$ is divisible by 21 and $A_j = A_j' A_j'' = A_n' A_1''$. But by Fact 9 (ii), either (1) $a_j = a_n$ or (2) $a_j = a_1$. In the first case, $a_2 \cdots a_{j-1} a_j = a_{j+1} \cdots a_{n-1} a_n$, so $w$ contains the square $(a_2 \cdots a_{j-1} a_j)^2$, a contradiction. In the second case, $a_1 \cdots a_{j-1} = a_j a_{j+1} \cdots a_{n-1}$, so $w$ contains the square $(a_1 \cdots a_{j-1})^2$, a contradiction.

This completes the proof of part (i).

It now remains to prove (ii). If $h(w)$ contains a $(7/3)^+$-power $yyy'$, then it contains a square, and by part (i) we know that $|y| \le 13$. We may assume that $|y'| \le \lceil \frac{1}{3} \cdot 13 \rceil = 5$, so $|yyy'| \le 31$. Hence we need only check the image of all 36 squarefree words in $\Sigma_4^3$ to ensure they do not contain any $(7/3)^+$-power. We leave this computation to the reader.
■
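The two finite checks used in the proof of Lemma 8 (the 264 squarefree words of length 5 for part (i), and the 36 squarefree words of length 3 for part (ii)) can be carried out with a short script. The sketch below is one way to do so; all function names are assumptions made for illustration.

```python
from fractions import Fraction

H = {
    "0": "011010011001001101001",
    "1": "100101100100110010110",
    "2": "100101100110110010110",
    "3": "011010011011001101001",
}


def h(word):
    """Apply the 21-uniform morphism h letter by letter."""
    return "".join(H[c] for c in word)


def ends_with_square(word):
    return any(word[-2 * p:-p] == word[-p:] for p in range(1, len(word) // 2 + 1))


def squarefree_words(alphabet, n):
    """All squarefree words of length n, built by extending one letter at a time."""
    words = [""]
    for _ in range(n):
        words = [w + c for w in words for c in alphabet if not ends_with_square(w + c)]
    return words


def longest_run(word, p):
    """Length of the longest factor with period p and length > p (0 if none)."""
    best = run = 0
    for i in range(len(word) - p):
        run = run + 1 if word[i] == word[i + p] else 0
        best = max(best, run)
    return best + p if best else 0


def has_square_longer_than(word, bound):
    """True if `word` contains a square yy with |y| > bound."""
    return any(longest_run(word, p) >= 2 * p
               for p in range(bound + 1, len(word) // 2 + 1))


def exceeds(word, alpha):
    """True if some factor of `word` has exponent strictly greater than alpha."""
    return any(Fraction(longest_run(word, p), p) > alpha for p in range(1, len(word)))


part_i = all(not has_square_longer_than(h(w), 13) for w in squarefree_words("0123", 5))
part_ii = all(not exceeds(h(w), Fraction(7, 3)) for w in squarefree_words("0123", 3))
print(part_i, part_ii)
```

Only Case 1 of part (i) is covered by this check; Case 2 ($|y| > 42$) still relies on the argument with Fact 9.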

Next, define the substitution $g : \Sigma_3^* \to 2^{\Sigma_4^*}$ as follows:

    g(0) = {0, 3}
    g(1) = {1}
    g(2) = {2}.

Let $|w|_a$ denote the number of occurrences of the letter $a$ in the word $w$. We prove

Lemma 10 Let $w \in \Sigma_3^*$ be any squarefree word. Then $h(g(w))$ is a language of $2^r$ words over $\Sigma_2$, where $r = |w|_0$, and moreover these words are of length $21|w|$ and avoid $(7/3)^+$-powers.

Proof. Let $w$ be a squarefree word over $\Sigma_3$. Then $g(w)$ is a language over $\Sigma_4$, and we claim that each word $x \in g(w)$ is squarefree. For suppose $x \in g(w)$ and $x$ contains a square, say $x = tuuv$ for some words $t, v \in \Sigma_4^*$, and $u \in \Sigma_4^+$. Define the morphism $f$ where $f(0) = f(3) = 0$, $f(1) = 1$, and $f(2) = 2$. Then $f(x) = w$ and $f(x)$ contains the square $f(u)f(u)$, a contradiction. It now follows from Lemma 8 that $h(x)$ avoids $(7/3)^+$-powers. ■
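To make the construction of Lemma 10 concrete, the sketch below expands $g$ on a small squarefree ternary word and applies $h$ to every resulting word. The example word `0102` and the helper names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

H = {
    "0": "011010011001001101001",
    "1": "100101100100110010110",
    "2": "100101100110110010110",
    "3": "011010011011001101001",
}
G = {"0": "03", "1": "1", "2": "2"}  # g(0) = {0, 3}, g(1) = {1}, g(2) = {2}


def h(word):
    return "".join(H[c] for c in word)


def g(word):
    """The finite language g(w): each 0 of w is independently rewritten as 0 or 3."""
    return ["".join(choice) for choice in product(*(G[c] for c in word))]


def exceeds(word, alpha):
    """True if some factor of `word` has exponent strictly greater than alpha."""
    n = len(word)
    for p in range(1, n):
        run = 0
        for i in range(n - p):
            run = run + 1 if word[i] == word[i + p] else 0
            if run and Fraction(run + p, p) > alpha:
                return True
    return False


w = "0102"                      # a squarefree word over {0, 1, 2} with |w|_0 = 2
images = [h(x) for x in g(w)]   # 2^2 = 4 binary words of length 21 * 4 = 84
print(len(set(images)), {len(v) for v in images})
```

The four images are pairwise distinct because $h$ is injective on letters and uniform.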

Finally, we obtain 

Theorem 11 Let $C_n$ be the number of binary words of length $n$ that are $(7/3)^+$-power-free. Then $C_n = \Omega(\gamma^n)$, where $\gamma = 2^{1/63} \doteq 1.011$.

Proof. Take any squarefree word $x$ of length $m$ over $\Sigma_3$. There must exist a symbol $a \in \Sigma_3$ such that $a$ occurs at least $\lceil m/3 \rceil$ times in $x$. By replacing each symbol $b$ in $x$ with $b - a \pmod 3$, we get a squarefree word $x'$ with at least $\lceil m/3 \rceil$ occurrences of 0.

Now consider $h(g(x'))$. We get at least $2^{m/3}$ words of length $21m$, and each word is $(7/3)^+$-power-free. Write $n = 21m - k$, where $0 \le k < 21$. By what precedes, there are at least $2^{m/3} \ge 2^{n/63}$ words of length $21m$ that are $(7/3)^+$-power-free. Each word of length $n$ is a prefix of at most $2^k$ words of length $21m$, and prefixes of $(7/3)^+$-power-free words are again $(7/3)^+$-power-free. Thus there are at least $2^{-k} 2^{n/63} \ge 2^{-21} 2^{n/63}$ words of length $n$ with the desired property. ■
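The arithmetic in this proof can be traced numerically. The sketch below computes $\gamma$, performs the relabeling $b \mapsto b - a \pmod 3$, and evaluates the lower bound for a given $n$; `relabel` and `lower_bound` are hypothetical names, and the sketch assumes (as the proof does, via Thue's theorem) that squarefree ternary words of every length exist.

```python
import math
from collections import Counter

GAMMA = 2 ** (1 / 63)


def relabel(x):
    """Shift the letters of a ternary word mod 3 so its most frequent letter becomes 0."""
    a = int(Counter(x).most_common(1)[0][0])
    return "".join(str((int(b) - a) % 3) for b in x)


def lower_bound(n):
    """The bound 2^(ceil(m/3) - k) on C_n, where n = 21m - k and 0 <= k < 21."""
    m = -(-n // 21)      # smallest m with 21m >= n
    k = 21 * m - n
    return 2.0 ** (math.ceil(m / 3) - k)


print(round(GAMMA, 3))   # 1.011
print(relabel("1210"))   # 0102
```

A cyclic shift of the alphabet is a bijection on letters, so it preserves squarefreeness.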

We have not tried to optimize the value of $\gamma$. It can be improved slightly in several ways: for example, by starting with a squarefree word over $\Sigma_3$ with a higher proportion of 0's; see [?].

For an upper bound on $C_n$, we may reason as follows: if $w$ is a word avoiding $(7/3)^+$-powers, then $w$ certainly has no occurrences of either 000 or 111. Let $E_n$ denote the number of binary words of length $n$ avoiding both 000 and 111. Then $C_n \le E_n$ and it is easy to see that

$$E_n = E_{n-1} + E_{n-2} \tag{2}$$

for $n \ge 3$. Now the characteristic polynomial of the recursion (2) is $x^2 - x - 1$, and the dominant zero of this polynomial is $(1 + \sqrt{5})/2 \doteq 1.62$. By well-known properties of linear recurrences we get $E_n = O(1.62^n)$.
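Recurrence (2) can be confirmed by direct enumeration for small $n$. The following minimal sketch counts binary words avoiding 000 and 111; the function name `E` mirrors the notation $E_n$ but is otherwise an arbitrary choice.

```python
import math
from itertools import product


def E(n):
    """Number of binary words of length n containing neither 000 nor 111."""
    count = 0
    for letters in product("01", repeat=n):
        word = "".join(letters)
        if "000" not in word and "111" not in word:
            count += 1
    return count


values = [E(n) for n in range(1, 13)]
print(values[:6])                  # [2, 4, 6, 10, 16, 26]
print((1 + math.sqrt(5)) / 2)      # dominant zero of x^2 - x - 1
```

The recurrence starts at $n = 3$: with $E_0 = 1$ one has $E_2 = 4 \ne E_1 + E_0$.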

This procedure may be automated. Noonan and Zeilberger [14] have written a Maple package DAVID_IAN that allows one to specify a list $L$ of forbidden words, and computes the generating function enumerating words avoiding members of $L$. We used this package for a list $L$ of 58 words of length $\le 24$:

    000, 111, 01010, 10101, ..., 110110010011011001001101

including words of the form $x^{(1 + \lfloor 7|x|/3 \rfloor)/|x|}$ for $1 \le |x| \le 10$. (Words for which shorter members of $L$ are factors can be omitted.) We obtained a characteristic polynomial of degree 39 with dominant root $\doteq 1.22990049 \cdots$. Therefore we have shown

Theorem 12 The number $C_n$ of binary words of length $n$ avoiding $(7/3)^+$-powers satisfies $C_n = O(1.23^n)$.
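A list of forbidden words of the kind described above can be generated mechanically. The sketch below builds, for each period $x$ with $1 \le |x| \le 10$, the shortest power of $x$ whose exponent exceeds $7/3$, and then discards words that contain another candidate as a proper factor. This is one reading of the reduction rule, so the size of the resulting list is not asserted here; the helper names are assumptions, and the Maple package itself is not used.

```python
from itertools import product


def shortest_excess_power(x):
    """Prefix of x x x ... of length 1 + floor(7|x|/3): exponent just above 7/3."""
    length = 1 + (7 * len(x)) // 3
    repeats = -(-length // len(x))          # ceiling division
    return (x * repeats)[:length]


def forbidden_list(max_period=10):
    """Candidates for all periods up to max_period, minus non-minimal words."""
    candidates = {
        shortest_excess_power("".join(letters))
        for p in range(1, max_period + 1)
        for letters in product("01", repeat=p)
    }
    minimal = [w for w in candidates
               if not any(v != w and v in w for v in candidates)]
    return sorted(minimal, key=lambda w: (len(w), w))


L = forbidden_list()
print(L[:4])                       # ['000', '111', '01010', '10101']
print(max(len(w) for w in L))      # 24
```

Feeding such a list to a Goulden-Jackson style computation is what yields the generating function mentioned in the text.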

6 Avoiding arbitrarily large squares 

Dekking [?] proved that every infinite overlap-free binary word must contain arbitrarily large squares. He also proved that there exists an infinite cubefree binary word that avoids squares $xx$ with $|x| \ge 4$. Furthermore the number 4 is best possible, since every binary word of length $\ge 30$ contains a cube or a square $xx$ with $|x| \ge 3$.

This leads to the following natural question: what is the largest exponent $\alpha$ such that every infinite $\alpha$-power-free binary word contains arbitrarily large squares? From Dekking's results we know $2 < \alpha < 3$. The answer is given in the following theorem.

Theorem 13 (i) Every infinite $7/3$-power-free binary word contains arbitrarily large squares.

(ii) There exists an infinite $(7/3)^+$-power-free binary word such that each square factor $xx$ satisfies $|x| \le 13$.

Proof. For (i), let $w$ be an infinite $7/3$-power-free binary word. By Theorem 6 and Eq. (1), any prefix of $w$ of length $2^{n+5}$ contains $\mu^{n+2}(0)$ as a factor. But $\mu^{n+2}(0) = \mu^n(0110)$, so any prefix of length $2^{n+5}$ contains the square factor $xx$ with $x = \mu^n(1)$.

For (ii), from Lemma 8 it follows that if $w$ is an infinite squarefree word over $\Sigma_4$, and $h$ is the morphism defined in § 5, then $h(w)$ has the desired properties. ■
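The identity used in part (i) is easy to verify for small $n$. The sketch below checks that $\mu^{n+2}(0) = \mu^n(0110)$ and that this word contains the square of $x = \mu^n(1)$, which has length $2^n$; `mu_iter` and `large_square_present` are illustrative names.

```python
def mu(word):
    """Thue-Morse morphism: 0 -> 01, 1 -> 10."""
    return "".join("01" if c == "0" else "10" for c in word)


def mu_iter(word, k):
    """Apply mu to `word` k times."""
    for _ in range(k):
        word = mu(word)
    return word


def large_square_present(n):
    """True if mu^(n+2)(0) equals mu^n(0110) and contains xx with x = mu^n(1)."""
    block = mu_iter("0", n + 2)
    x = mu_iter("1", n)
    return block == mu_iter("0110", n) and (x + x) in block


print(all(large_square_present(n) for n in range(8)))  # True
```

Since $|\mu^n(1)| = 2^n$, the squares found this way grow without bound as $n$ increases.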

We note that the number 13 in Theorem 13 (ii) is not best possible. A forthcoming paper examines this question in more detail.

7 Numerical Results



Let $A_n$ (resp., $B_n$, $C_n$, $D_n$) denote the number of overlap-free words (resp., $7/3$-power-free words, $(7/3)^+$-power-free words, cubefree words) of length $n$ over the alphabet $\Sigma_2$. We give here the values of these sequences for $0 \le n \le 28$.

     n    A_n    B_n     C_n      D_n
     0      1      1       1        1
     1      2      2       2        2
     2      4      4       4        4
     3      6      6       6        6
     4     10     10      10       10
     5     14     14      14       16
     6     20     20      20       24
     7     24     24      30       36
     8     30     30      38       56
     9     36     40      50       80
    10     44     48      64      118
    11     48     56      86      174
    12     60     64     108      254
    13     60     76     136      378
    14     62     82     178      554
    15     72     92     222      802
    16     82    106     276     1168
    17     88    124     330     1716
    18     96    142     408     2502
    19    112    152     500     3650
    20    120    172     618     5324
    21    120    192     774     7754
    22    136    210     962    11320
    23    148    220    1178    16502
    24    164    234    1432    24054
    25    152    256    1754    35058
    26    154    284    2160    51144
    27    148    308    2660    74540
    28    162    314    3292   108664
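The first rows of the table can be reproduced by a simple backtracking enumeration. The sketch below extends words one letter at a time and rejects an extension as soon as some suffix becomes a forbidden power; `bad_suffix` and `count` are hypothetical names, and the search is only practical for small $n$.

```python
from fractions import Fraction


def bad_suffix(word, alpha, strict):
    """True if some suffix of `word` is a beta-power with beta >= alpha
    (beta > alpha when strict is True)."""
    n = len(word)
    for p in range(1, n):
        k = 0  # the suffix of length k + p is the longest one with period p
        while k < n - p and word[n - 1 - k] == word[n - 1 - k - p]:
            k += 1
        if k:
            beta = Fraction(k + p, p)
            if beta > alpha or (beta == alpha and not strict):
                return True
    return False


def count(n, alpha, strict):
    """Number of binary words of length n avoiding alpha-powers (alpha^+ if strict)."""
    def extend(word):
        if len(word) == n:
            return 1
        return sum(extend(word + c) for c in "01"
                   if not bad_suffix(word + c, alpha, strict))
    return extend("")


A = [count(n, Fraction(2), True) for n in range(13)]       # overlap-free
B = [count(n, Fraction(7, 3), False) for n in range(13)]   # 7/3-power-free
C = [count(n, Fraction(7, 3), True) for n in range(13)]    # (7/3)^+-power-free
D = [count(n, Fraction(3), False) for n in range(13)]      # cubefree
print(A, B, C, D, sep="\n")
```

Only suffixes need to be tested at each step, because every proper prefix has already been checked.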